

Search for: All records

Creators/Authors contains: "Petrov, Dmitri"


  1. Harris, Kelley (Ed.)
    Measuring the fitnesses of genetic variants is a fundamental objective in evolutionary biology. A standard approach for measuring microbial fitnesses in bulk involves labeling a library of genetic variants with unique sequence barcodes, competing the labeled strains in batch culture, and using deep sequencing to track changes in the barcode abundances over time. However, idiosyncratic properties of barcodes can induce nonuniform amplification or uneven sequencing coverage that causes some barcodes to be over- or under-represented in samples. This systematic bias can result in erroneous read count trajectories and misestimates of fitness. Here, we develop a computational method, named REBAR (Removing the Effects of Bias through Analysis of Residuals), for inferring the effects of barcode processing bias by leveraging the structure of systematic deviations in the data. We illustrate this approach by applying it to two independent data sets, and demonstrate that this method estimates and corrects for bias more accurately than standard proxies, such as GC-based corrections. REBAR mitigates bias and improves fitness estimates in high-throughput assays without introducing additional complexity to the experimental protocols, with potential applications in a range of experimental evolution and mutation screening contexts. 
  2. The long-term success of introduced populations depends on both their initial size and ability to compete against existing residents, but it remains unclear how these factors collectively shape colonization dynamics. Here, we investigate how initial population (propagule) size shapes the outcome of community coalescence by systematically mixing eight pairs of in vitro microbial communities at ratios that vary over six orders of magnitude, and we compare our results to neutral ecological theory. Although the composition of the resulting cocultures deviated substantially from neutral expectations, each coculture contained species whose relative abundance depended on propagule size even after ~40 generations of growth. Using a consumer–resource model, we show that this dose-dependent colonization can arise when resident and introduced species have high niche overlap and consume shared resources at similar rates. Strain isolates displayed longer-lasting dose dependence when introduced into diverse communities than in pairwise cocultures, consistent with our model’s prediction that propagule size should have larger, more persistent effects in diverse communities. Our model also successfully predicted that species with similar resource-utilization profiles, as inferred from growth in spent media and untargeted metabolomics, would show stronger dose dependence in pairwise coculture. This work demonstrates that transient, dose-dependent colonization dynamics can emerge from resource competition and exert long-term effects on the outcomes of community coalescence. 
    Free, publicly-accessible full text available March 18, 2026
  3. The phrase “survival of the fittest” has become an iconic descriptor of how natural selection works. And yet, precisely measuring fitness, even for single-celled microbial populations growing in controlled laboratory conditions, remains a challenge. While numerous methods exist to perform these measurements, including recently developed methods utilizing DNA barcodes, all methods are limited in their precision to differentiate strains with small fitness differences. In this study, we rule out some major sources of imprecision, but still find that fitness measurements vary substantially from replicate to replicate. Our data suggest that very subtle and difficult-to-avoid environmental differences between replicates create systematic variation across fitness measurements. We conclude by discussing how fitness measurements should be interpreted given their extreme environment dependence. This work was inspired by the scientific community who followed us and gave us tips as we live-tweeted a high-replicate fitness measurement experiment at #1BigBatch.
  4. Gossmann, Toni (Ed.)
    Spiders (Araneae) have a diverse spectrum of morphologies, behaviors, and physiologies. Attempts to understand the genomic basis of this diversity are often hindered by their large, heterozygous, and AT-rich genomes with high repeat content, resulting in highly fragmented, poor-quality assemblies. As a result, the key attributes of spider genomes, including gene family evolution, repeat content, and gene function, remain poorly understood. Here, we used Illumina and Dovetail Chicago technologies to sequence the genome of the long-jawed spider Tetragnatha kauaiensis, producing an assembly distributed along 3,925 scaffolds with an N50 of ∼2 Mb. Using comparative genomics tools, we explore genome evolution across available spider assemblies. Our findings suggest that the previously reported and vast genome size variation in spiders is linked to the different representation and number of transposable elements. Using statistical tools to uncover gene-family level evolution, we find expansions associated with the sensory perception of taste, immunity, and metabolism. In addition, we report strikingly different histories of chemosensory, venom, and silk gene families, with the first two evolving much earlier, affected by the ancestral whole genome duplication in Arachnopulmonata (∼450 Ma) and exhibiting higher numbers. Together, our findings reveal that spider genomes are highly variable and that genomic novelty may have been driven by the burst of an ancient whole genome duplication, followed by gene family and transposable element expansion.
  5. To advance our understanding of adaptation to temporally varying selection pressures, we identified signatures of seasonal adaptation occurring in parallel among Drosophila melanogaster populations. Specifically, we estimated allele frequencies genome-wide from flies sampled early and late in the growing season from 20 widely dispersed populations. We identified parallel seasonal allele frequency shifts across North America and Europe, demonstrating that seasonal adaptation is a general phenomenon of temperate fly populations. Seasonally fluctuating polymorphisms are enriched in large chromosomal inversions, and we find a broad concordance between seasonal and spatial allele frequency change. The direction of allele frequency change at seasonally variable polymorphisms can be predicted by weather conditions in the weeks prior to sampling, linking the environment and the genomic response to selection. Our results suggest that fluctuating selection is an important evolutionary force affecting patterns of genetic variation in Drosophila.
  6. Nielsen, Rasmus (Ed.)
    Drosophila melanogaster is a leading model in population genetics and genomics, and a growing number of whole-genome data sets from natural populations of this species have been published in recent years. A major challenge is the integration of disparate data sets, often generated using different sequencing technologies and bioinformatic pipelines, which hampers our ability to address questions about the evolution of this species. Here we address these issues by developing a bioinformatics pipeline that maps pooled sequencing (Pool-Seq) reads from D. melanogaster to a hologenome consisting of fly and symbiont genomes and estimates allele frequencies using either a heuristic (PoolSNP) or a probabilistic variant caller (SNAPE-pooled). We use this pipeline to generate the largest data repository of genomic data available for D. melanogaster to date, encompassing 271 previously published and unpublished population samples from over 100 locations in >20 countries on four continents. Several of these locations have been sampled at different seasons across multiple years. This data set, which we call Drosophila Evolution over Space and Time (DEST), is coupled with sampling and environmental metadata. A web-based genome browser and web portal provide easy access to the SNP data set. We further provide guidelines on how to use Pool-Seq data for model-based demographic inference. Our aim is to provide this scalable platform as a community resource that can be easily extended via future efforts for an even more extensive cosmopolitan data set. Our resource will enable population geneticists to analyze spatiotemporal genetic patterns and evolutionary dynamics of D. melanogaster populations in unprecedented detail.